Abstract

For graphs G and H, an H-colouring of G (or homomorphism
from G to H) is a function from the vertices of G to the vertices of H
that preserves adjacency. H-colourings generalize such graph theory
notions as proper colourings and independent sets.

For a given H, $k \in V(H)$ and G we consider the proportion of
vertices of G that get mapped to k in a uniformly chosen H-colouring
of G. Our main result concerns this quantity when G is regular and
bipartite. We find numbers $0 \le a^-(k) \le a^+(k) \le 1$ with the property
that for all such G, with high probability the proportion is between
$a^-(k)$ and $a^+(k)$, and we give examples where these extremes are
achieved. For many H we have $a^-(k) = a^+(k)$ for all k and so in
these cases we obtain a quite precise description of the almost sure
appearance of a randomly chosen H-colouring.

As a corollary, we show that in a uniform proper q-colouring of a
regular bipartite graph, if q is even then with high probability every
colour appears on a proportion close to $1/q$ of the vertices, while if q
is odd then with high probability every colour appears on at least a
proportion close to $1/(q+1)$ of the vertices and at most a proportion
close to $1/(q-1)$ of the vertices.

Our results generalize to natural models of weighted H-colourings,
and also to bipartite graphs which are sufficiently close to regular. As
an application of this latter extension we describe the typical structure
of H-colourings of graphs which are obtained from n-regular bipartite
graphs by percolation, and we show that $p = 1/n$ is a threshold function
across which the typical structure changes.

The approach is through entropy, and extends work of J. Kahn, 
who considered the size of a randomly chosen independent set of a 
regular bipartite graph. 

1 Introduction and statement of results 

Let G = (V(G), E(G)) be a simple, loopless, finite graph, and let H =
(V(H), E(H)) be a finite graph without multiple edges but perhaps with
loops. An H-colouring of G, or homomorphism from G to H, is a function
from V(G) to V(H) that preserves adjacency. The set of H-colourings of G
is thus

\[ \mathrm{Hom}(G,H) = \{ f : V(G) \to V(H) \;:\; uv \in E(G) \Rightarrow f(u)f(v) \in E(H) \}. \]

H-colourings generalize a number of important graph theory notions. For
example, when H is the complete graph on q vertices, Hom(G, H) coincides
with the set of proper q-colourings of G, and when H consists of two vertices
joined by an edge, with a loop at one of the vertices, then Hom(G, H) may
be identified with the set of independent sets of G, via the preimage of the
unlooped vertex.
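
The two special cases just mentioned can be checked by brute force on a small graph. The following is only a sketch: the function name `count_homs`, the adjacency-matrix encoding of H and the choice of the 4-cycle as G are illustrative assumptions, not notation from the paper.

```python
from itertools import product

def count_homs(n, edges, h_adj):
    """Count maps f: {0..n-1} -> V(H) sending every edge of G to an edge (or loop) of H."""
    h_vertices = range(len(h_adj))
    return sum(
        all(h_adj[f[u]][f[v]] for u, v in edges)
        for f in product(h_vertices, repeat=n)
    )

# G = the 4-cycle on vertices 0..3 (illustrative choice).
C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]

# H = K_3: symmetric adjacency matrix with no loops.
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

# H = two vertices joined by an edge, with a loop at vertex 0 only.
H_IND = [[1, 1], [1, 0]]

proper_3_colourings = count_homs(4, C4, K3)    # proper 3-colourings of C4
independent_sets = count_homs(4, C4, H_IND)    # independent sets of C4
print(proper_3_colourings, independent_sets)
```

Enumeration of all maps is exponential in the number of vertices of G, so this is only meant for sanity checks on very small examples.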

H-colourings have a natural statistical physics interpretation as configurations
in hard-constraint spin models. Here, the vertices of G are thought of
as sites that are occupied by particles, with edges of G representing pairs of
bonded sites. The vertices of H are the different types of particles (or spins),
and the occupation rule is that bonded sites must be occupied by pairs of
particles that are adjacent in H. A legal configuration in such a spin model
is exactly an H-colouring of G.

From the statistical physics standpoint, there is a very natural family
of probability distributions that can be put on Hom(G, H). Fix a set of
positive weights $\Lambda = \{\lambda_i : i \in V(H)\}$ indexed by the vertices of H. We
think of the magnitude of $\lambda_k$ as measuring how likely particle k is to appear
at each site. This can be formalized by giving each $f \in \mathrm{Hom}(G,H)$ weight
$w_\Lambda(f) = \prod_{v \in V(G)} \lambda_{f(v)}$ and probability

\[ p_\Lambda(f) = \frac{w_\Lambda(f)}{Z_\Lambda(G,H)}, \]

where $Z_\Lambda(G,H) = \sum_{f \in \mathrm{Hom}(G,H)} w_\Lambda(f)$ is the appropriate normalizing constant
or partition function of the model. For an introduction to statistical
physics spin models from a combinatorial perspective, see for example [3].

The question to be addressed in this paper is the following. What can be
said about an f that is drawn from Hom(G, H) according to the distribution
$p_\Lambda$? Specifically, for each $f \in \mathrm{Hom}(G,H)$ and $k \in V(H)$ set

\[ s(k,f) = \frac{|f^{-1}(k)|}{|V(G)|} \]

and

\[ p_\Lambda(k) = E_\Lambda(s(k,f)) = \frac{1}{|V(G)|} \sum_{v \in V(G)} p_\Lambda(f(v) = k). \]

The aim of this paper is to give fairly precise estimates for $p_\Lambda(k)$ and the
distribution of $s(k,f)$ for f chosen according to $p_\Lambda$, when G is bipartite and
either regular or sufficiently close to regular.

The point of departure for this work is a result of Kahn on the hard-core
model. When $H = H_{\mathrm{ind}}$ with $V(H_{\mathrm{ind}}) = \{0,1\}$ and $E(H_{\mathrm{ind}}) = \{00, 01\}$,
the set of vertices of G mapped to 1 forms an independent set in G, and
$\mathrm{Hom}(G, H_{\mathrm{ind}})$ can be identified with $\mathcal{I}(G)$, the set of independent sets in G.
For each $\lambda > 0$, the hard-core model on G is the probability distribution
hc($\lambda$) on $\mathcal{I}(G)$ that assigns to each $I \in \mathcal{I}(G)$ a probability proportional to
$\lambda^{|I|}$. One of the oldest and most studied spin models in statistical physics,
this is a simple mathematical model of the occupation of space (represented
by G) by particles of non-negligible size. The model can easily be realized
as a spin model of the kind described above by assigning weights $\lambda_0 = 1$ and
$\lambda_1 = \lambda$ to the vertices of $H_{\mathrm{ind}}$.

Kahn [7] studied this model on a regular bipartite graph G. He proved
that for all fixed $\lambda > 0$, the model exhibits phase coexistence: if G has
bipartition $\mathcal{E} \cup \mathcal{O}$ into two classes of equal size, then an independent set
chosen according to hc($\lambda$) tends to come either mostly from $\mathcal{E}$ or mostly
from $\mathcal{O}$. This is reflected in the fact that the proportion of vertices in an
independent set chosen according to hc($\lambda$) is concentrated close to
$\lambda/(2(1+\lambda))$, which is exactly the expected proportion for the distribution
that half the time picks an hc($\lambda$) independent set from $\mathcal{E}$ and half the time
picks one from $\mathcal{O}$. The following theorem ([7, Theorem 1.4 & Corollary 1.5])
formalizes this.

Theorem 1.1 Let $\lambda > 0$ be fixed. There are positive constants $c_1, c_2, c_3$
and $c_4$ (depending on $\lambda$) such that for every d-regular bipartite graph G on
N vertices, the following two statements hold. Firstly, for every $\varepsilon > c_1/\sqrt{d}$,
if I is chosen from $\mathcal{I}(G)$ according to the distribution hc($\lambda$) then

\[ \Pr\left( \left| |I| - \frac{\lambda N}{2(1+\lambda)} \right| > \varepsilon N \right) \le c_2 \varepsilon^{-1} 2^{-c_3 \varepsilon^2 N}. \]

Secondly,

\[ \left| \frac{E(|I|)}{N} - \frac{\lambda}{2(1+\lambda)} \right| \le c_4 \zeta \]

where

\[ \zeta = \max\left\{ \frac{1}{\sqrt{d}}, \sqrt{\frac{\log N}{N}} \right\}. \qquad (1) \]



In particular, a uniformly chosen independent set ($\lambda = 1$) from a regular
bipartite graph consists, with high probability, of close to one quarter of the
vertices. While this corollary may seem more natural than the formulation
of Theorem 1.1, it is worth noting that in order to prove the theorem in the
special case of $\lambda = 1$ it is necessary (at least using the entropy methods of
[7]) to pass to the more general weighted model first. Similarly, it might
seem more natural in the present paper to focus on the structure of uniform
H-colourings, but we are unable to obtain any results without introducing
weights.

From (1) we see that Theorem 1.1 only gives a concentration result when
we consider families of graphs with d going to infinity. This is not just an
artifact of the proof. For families of graphs with d fixed (and only N going
to infinity), the behavior of $E(|I|)/N$ depends very much on the particular
choice of family. As an example, consider the case d = 2. If $G_N$ is the disjoint
union of N/4 copies of the cycle $C_4$, and I is chosen uniformly from
$\mathcal{I}(G_N)$, then $E(|I|)/N$ is easily seen to equal 2/7 (each copy of $C_4$ has 7
independent sets, of total size 8). If, however, $G'_N$ is the disjoint union of
N/6 copies of the cycle $C_6$, then $E(|I|)/N$ equals 5/18 (each copy of $C_6$ has
18 independent sets, of total size 30). For this reason we implicitly assume
throughout that d is going to infinity.
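
The two densities in the d = 2 example can be reproduced by enumerating independent sets of a single cycle (the density of a disjoint union of copies is the same as that of one copy). This is a small sketch; the helper names `cycle` and `hard_core_density` are made up for the illustration, and exact rational arithmetic is used so the values can be compared directly.

```python
from fractions import Fraction
from itertools import product

def cycle(n):
    """Edge list of the cycle on vertices 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

def hard_core_density(n, edges, lam=Fraction(1)):
    """E(|I|)/n for I drawn from hc(lam) on a graph with vertex set {0..n-1}."""
    z = Fraction(0)          # partition function: sum of lam^|I|
    size_sum = Fraction(0)   # sum of |I| * lam^|I|
    for bits in product((0, 1), repeat=n):
        if any(bits[u] and bits[v] for u, v in edges):
            continue         # not an independent set
        weight = lam ** sum(bits)
        z += weight
        size_sum += weight * sum(bits)
    return size_sum / (z * n)

density_c4 = hard_core_density(4, cycle(4))
density_c6 = hard_core_density(6, cycle(6))
print(density_c4, density_c6)
```

Both values differ from 1/4, the value that Theorem 1.1 gives in the limit of large d at activity 1.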

We now set up some notation that allows us to state our main result,
which is an extension of Theorem 1.1 to arbitrary weighted H-colourings.
From now on, whenever H and $\Lambda$ are mentioned, it will be assumed that H
is a finite graph without multiple edges but perhaps with loops, and that $\Lambda$
is a set of positive weights indexed by the vertices of H. For $A, B \subseteq V(H)$
write $A \sim B$ if for all $u \in A$ and $v \in B$ we have $uv \in E(H)$, and set

\[ \eta_\Lambda(H) = \max\{ w_\Lambda(A) w_\Lambda(B) : A \sim B \} \]

where $w_\Lambda(A) = \sum_{i \in A} \lambda_i$. Then set

\[ \mathcal{M}_\Lambda(H) = \{ (A,B) : A, B \subseteq V(H),\ A \sim B,\ w_\Lambda(A) w_\Lambda(B) = \eta_\Lambda(H) \}. \]

Next define

\[ a^+_\Lambda(k) = \frac{\max\{ w_\Lambda(A) \lambda_k 1_{\{k \in B\}} + w_\Lambda(B) \lambda_k 1_{\{k \in A\}} : (A,B) \in \mathcal{M}_\Lambda(H) \}}{2 \eta_\Lambda(H)} \]

and define $a^-_\Lambda(k)$ similarly, with max replaced by min. Note that if k does
not appear in any $(A,B) \in \mathcal{M}_\Lambda(H)$ then $a^+_\Lambda(k) = 0$ and that if there is a
pair $(A,B) \in \mathcal{M}_\Lambda(H)$ in which k does not appear then $a^-_\Lambda(k) = 0$. Note also
that $a^-_\Lambda(k) \le a^+_\Lambda(k)$. Finally, note that $a^+_\Lambda(k)$ and $a^-_\Lambda(k)$ both take the form

\[ \frac{\lambda_k 1_{\{k \in A\}}}{2 w_\Lambda(A)} + \frac{\lambda_k 1_{\{k \in B\}}}{2 w_\Lambda(B)} \]

for some $(A,B) \in \mathcal{M}_\Lambda(H)$. We may interpret this quantity as the expected
proportion of vertices mapped to k in a $p_\Lambda$-chosen H-colouring subject to the
condition that all vertices from one partition class of G get mapped to A and
all from the other class get mapped to B; we will refer to such a colouring
as a pure-(A, B) colouring. Finally, for every $\varepsilon > 0$ and $k \in V(H)$ define

\[ I_k(\varepsilon) = [0, a^-_\Lambda(k) - \varepsilon) \cup (a^+_\Lambda(k) + \varepsilon, 1]. \]
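
For small H the quantities just defined can be computed by exhaustive search over pairs of subsets. The sketch below is an illustration under stated assumptions: H is given as a symmetric 0/1 adjacency matrix with at least one edge, the weights are exact (integers or `Fraction`s) so that ties can be detected by equality, and the names `extremal_pairs` and `a_minus_plus` are invented for this example.

```python
from fractions import Fraction
from itertools import combinations

def nonempty_subsets(vertices):
    for r in range(1, len(vertices) + 1):
        for combo in combinations(vertices, r):
            yield frozenset(combo)

def weight(lam, subset):
    """w(A): total weight of the vertices in the subset."""
    return sum(lam[i] for i in subset)

def extremal_pairs(adj, lam):
    """Return (eta, M): the maximum of w(A)w(B) over A ~ B, and the maximising pairs."""
    vertices = list(range(len(adj)))
    eta, pairs = 0, []
    for a in nonempty_subsets(vertices):
        for b in nonempty_subsets(vertices):
            if not all(adj[u][v] for u in a for v in b):
                continue  # A ~ B fails
            value = weight(lam, a) * weight(lam, b)
            if value > eta:
                eta, pairs = value, [(a, b)]
            elif value == eta:
                pairs.append((a, b))
    return eta, pairs

def a_minus_plus(adj, lam, k):
    """Return (a^-(k), a^+(k)) as exact fractions."""
    eta, pairs = extremal_pairs(adj, lam)
    values = [
        Fraction(lam[k] * (weight(lam, a) * (k in b) + weight(lam, b) * (k in a))) / (2 * eta)
        for a, b in pairs
    ]
    return min(values), max(values)

H_IND = [[1, 1], [1, 0]]                  # loop at 0, edge 01
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]    # complete graph on 3 vertices

print(a_minus_plus(H_IND, [1, 2], 1))     # hard-core model with activity 2
print(a_minus_plus(K3, [1, 1, 1], 0))     # uniform proper 3-colourings
```

For the hard-core weights (1, 2) both extremes agree with the value $\lambda/(2(1+\lambda))$ at $\lambda = 2$, while for $K_3$ with unit weights the two extremes differ.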



Before stating our main result, we motivate it by considering weighted
H-colourings of $K_{d,d}$, the complete bipartite graph with d vertices in each class,
for some fixed H and $\Lambda$. The adjacency structure of $K_{d,d}$ ensures that all
H-colourings are pure-(A, B) for some (A, B) with $A \sim B$, and that moreover
all but a vanishing proportion (in d) of $Z_\Lambda(K_{d,d}, H)$ comes from pure-(A, B)
colourings for some $(A,B) \in \mathcal{M}_\Lambda(H)$. It follows that for each $k \in V(H)$, in
an H-colouring chosen according to $p_\Lambda$ we have that with probability $1 - o(1)$
the proportion of vertices of $K_{d,d}$ mapped to k will be between $a^-_\Lambda(k) - o(1)$
and $a^+_\Lambda(k) + o(1)$. Our main result, which we now state, asserts that this
property of $K_{d,d}$ is essentially shared by all d-regular bipartite graphs.

Theorem 1.2 Fix H and $\Lambda$. There are positive constants $c_1, c_2, c_3$ and $c_4$
(depending on H and $\Lambda$) such that for every d-regular bipartite graph G on
N vertices, the following two statements hold. Firstly, for every $\varepsilon > c_1/\sqrt{d}$
and $k \in V(H)$ we have

\[ p_\Lambda( s(k,f) \in I_k(\varepsilon) ) \le c_2 \varepsilon^{-1} 2^{-c_3 \varepsilon^2 N}. \qquad (2) \]

Secondly, for each $k \in V(H)$ we have

\[ p_\Lambda(k) \in [ a^-_\Lambda(k) - c_4 \zeta,\ a^+_\Lambda(k) + c_4 \zeta ] \qquad (3) \]

where $\zeta$ is as defined in (1).

In other words, for regular bipartite G the distribution $p_\Lambda$ is concentrated
on H-colourings for which, for every $k \in V(H)$, the proportion of vertices
mapped to k is roughly between $a^-_\Lambda(k)$ and $a^+_\Lambda(k)$.

The proof of Theorem 1.2 goes along the following lines. We upper bound
the contribution to $Z_\Lambda(G,H)$ from those $f \in \mathrm{Hom}(G,H)$ with $|f^{-1}(k)|/N =
\gamma > a^+_\Lambda(k) + \varepsilon$ by $Z_{\Lambda(k,\delta)}(G,H)/(1+\delta)^{\gamma N}$ for some suitably small $\delta > 0$ (where
$\Lambda(k,\delta)$ is obtained from $\Lambda$ by multiplying $\lambda_k$ by $1+\delta$ and leaving all other $\lambda_i$
unchanged). We in turn upper bound $Z_{\Lambda(k,\delta)}(G,H)$ using a result of Galvin
and Tetali [6] to the effect that for all H and $\Lambda$ and all d-regular bipartite
graphs G on N vertices we have

\[ Z_\Lambda(G,H) \le Z_\Lambda(K_{d,d},H)^{N/2d} \qquad (4) \]

where $K_{d,d}$ is the complete bipartite graph with d vertices in each partition
class. We upper bound $Z_{\Lambda(k,\delta)}(K_{d,d},H)$ in terms of $\eta_{\Lambda(k,\delta)}(H)$, and in the
end we get, using our choice of $a^+_\Lambda(k)$ and for some sufficiently small $\delta$, an
upper bound on the contribution that is significantly smaller than a trivial
lower bound on $Z_\Lambda(G,H)$, showing that those $f \in \mathrm{Hom}(G,H)$ with $|f^{-1}(k)|/N >
a^+_\Lambda(k) + \varepsilon$ do not contribute greatly to the partition function. The same
strategy works for $|f^{-1}(k)|/N$ falling significantly below $a^-_\Lambda(k)$. The details
(in the more general setting of Theorem 1.6) are given in Section 3.
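
The Galvin and Tetali bound (4) can be observed numerically on a tiny instance. The sketch below takes G to be the 6-cycle (a 2-regular bipartite graph on N = 6 vertices) and H the hard-core graph with unit weights; these choices, and the helper names, are assumptions made only for the illustration.

```python
from itertools import product
from math import prod

def partition_function(n, edges, adj, lam):
    """Z(G, H): sum over H-colourings f of the product of lam[f(v)] over vertices v."""
    total = 0
    for f in product(range(len(adj)), repeat=n):
        if all(adj[f[u]][f[v]] for u, v in edges):
            total += prod(lam[c] for c in f)
    return total

def cycle(n):
    return [(i, (i + 1) % n) for i in range(n)]

def complete_bipartite(d):
    """Edges of K_{d,d} with classes {0..d-1} and {d..2d-1}."""
    return [(i, d + j) for i in range(d) for j in range(d)]

H_IND = [[1, 1], [1, 0]]   # loop at 0, edge 01
LAM = [1, 1]               # unit weights: Z counts independent sets

d, N = 2, 6
z_g = partition_function(N, cycle(N), H_IND, LAM)
z_kdd = partition_function(2 * d, complete_bipartite(d), H_IND, LAM)
bound = z_kdd ** (N / (2 * d))   # right-hand side of (4)
print(z_g, z_kdd, bound)
```

Here the left-hand side counts the independent sets of $C_6$ and the right-hand side is the count for $K_{2,2}$ raised to the power 3/2, so the inequality is close to tight on this instance.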

When $a^-_\Lambda(k) = a^+_\Lambda(k)$ for all k, we obtain a single vector around which
$(s(k,f) : k \in V(H))$ is concentrated for f chosen according to $p_\Lambda$.

Corollary 1.3 Fix H and $\Lambda$. Suppose that for all $k \in V(H)$ there is an $a_\Lambda(k)$
such that $a^-_\Lambda(k) = a^+_\Lambda(k) = a_\Lambda(k)$. Then there are positive constants $c_1, c_2, c_3$
and $c_4$ (depending on H and $\Lambda$) such that for every d-regular bipartite graph
G on N vertices the following two statements hold. Firstly, for $\varepsilon > c_1/\sqrt{d}$
we have

\[ p_\Lambda\left( \left\| (s(k,f))_{k \in V(H)} - (a_\Lambda(k))_{k \in V(H)} \right\|_\infty > \varepsilon \right) \le c_2 \varepsilon^{-1} 2^{-c_3 \varepsilon^2 N}. \]

Secondly, we have

\[ \left\| (p_\Lambda(k))_{k \in V(H)} - (a_\Lambda(k))_{k \in V(H)} \right\|_\infty \le c_4 \zeta \]

with $\zeta$ as in (1).

A situation in which Corollary 1.3 applies is when either $\mathcal{M}_\Lambda(H) = \{(A,A)\}$
or $\mathcal{M}_\Lambda(H) = \{(A,B),(B,A)\}$ (for some $A \ne B$). This is in a sense the
generic situation. Indeed, for every H, if the weights $\lambda_i$ are chosen from
any continuous distribution supported on $\{x \in \mathbb{R}^{|V(H)|} : x > 0\}$, then with
probability 1 we will have $\mathcal{M}_\Lambda(H)$ of the form described. As we will see in
Example C below, Corollary 1.3 also applies in some other natural situations.

The gap between $a^-_\Lambda(k)$ and $a^+_\Lambda(k)$ (if there is one) cannot be closed in
general, as the first part of the following theorem shows.

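Theorem 1.4 Fix H and $\Lambda$. There is a family $\{G_d\}_{d=1}^{\infty}$ of d-regular bipartite
graphs, a function $g(d) = o(1)$ and a positive constant c (depending on H
and $\Lambda$) such that for each $k \in V(H)$,

\[ \left. \begin{array}{l} p_\Lambda\left( |s(k,f) - a^+_\Lambda(k)| \le g(d) \right) \\ p_\Lambda\left( |s(k,f) - a^-_\Lambda(k)| \le g(d) \right) \end{array} \right\} \ge c - g(d). \]

There is also a family $\{G'_d\}_{d=1}^{\infty}$ of d-regular bipartite graphs, a function $g(d) =
o(1)$ and (for each $k \in V(H)$) an $a_\Lambda(k)$ satisfying $a^-_\Lambda(k) \le a_\Lambda(k) \le a^+_\Lambda(k)$
such that for each k,

\[ p_\Lambda\left( |s(k,f) - a_\Lambda(k)| \le g(d) \right) \ge 1 - g(d) \]

and

\[ |p_\Lambda(k) - a_\Lambda(k)| \le g(d). \]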

We prove Theorem 1.4 in Section 5. The graphs $G_d$ we exhibit will be suitably
chosen random regular graphs, and we will use the expansion of these graphs
to show that all but o(1) of $p_\Lambda$ is concentrated on pure-(A, B) colourings
for $(A,B) \in \mathcal{M}_\Lambda(H)$. The graphs $G'_d$ will be disjoint unions of complete
bipartite graphs on 2d vertices. Basic concentration estimates together with
the independence of the components will give the claimed result.

We now explore the consequences of Theorem 1.2 for some specific choices
of H and $\Lambda$.

Example A (Hard-core model) Let $H = H_{\mathrm{ind}}$ be as described earlier, with
$\lambda_0 = 1$ and $\lambda_1 = \lambda$. We have seen that an element of $\mathrm{Hom}(G, H_{\mathrm{ind}})$ chosen
according to $p_\Lambda$ is a configuration in the hard-core model on G with activity
$\lambda$. With these choices we have $\mathcal{M}_\Lambda(H_{\mathrm{ind}}) = \{(\{0\}, \{0,1\}), (\{0,1\}, \{0\})\}$ and

\[ a^-_\Lambda(1) = a^+_\Lambda(1) = \frac{\lambda}{2(1+\lambda)} \]

and so Theorem 1.2 indeed generalizes Theorem 1.1 as claimed.

Example B (Multistate hard-core model) Let $H = H_k$ be the graph on
vertex set $\{0, \ldots, k\}$ with $ij \in E(H)$ if and only if $i + j \le k$, and $\lambda_i = \lambda^i$
for some fixed $\lambda > 0$. An element of $\mathrm{Hom}(G, H_k)$ chosen according to $p_\Lambda$ is
exactly a configuration of the multistate hard-core (or multicast communications)
model on G with activity $\lambda$. This model allows multiple particles (up
to and including k) at each site, with the restriction that there are no more
than k particles in total across each edge. A generalization of the hard-core
model (the case k = 1), it has been studied in a variety of contexts: in
communications [32], statistical physics [?] and combinatorics [5]. For k even
the unique pair $(A,B) \in \mathcal{M}_\Lambda(H_k)$ has $A = B = \{0, \ldots, k/2\}$, while for k odd,
say $k = 2\ell + 1$, we have $\mathcal{M}_\Lambda(H_k) = \{(A,B), (B,A)\}$ with $A = \{0, \ldots, \ell\}$ and
$B = \{0, \ldots, \ell+1\}$. In either case Corollary 1.3 shows that for this model
$(s(k,f) : k \in V(H))$ is concentrated close to a single value for f chosen
according to $p_\Lambda$.

Example C (Uniform proper q-colourings) Let $H = K_q$, the complete graph
on q vertices, and $\Lambda = (1, \ldots, 1)$. An element of $\mathrm{Hom}(G, K_q)$ chosen according
to $p_\Lambda$ corresponds to a uniform proper q-colouring of G. In this case the
elements of $\mathcal{M}_\Lambda(K_q)$ are the ordered partitions of $V(K_q)$ into two classes as near
equal in size as possible, and an easy calculation gives that for all colours k

\[ a^-_\Lambda(k) = \frac{1}{2\lceil q/2 \rceil} \quad \text{and} \quad a^+_\Lambda(k) = \frac{1}{2\lfloor q/2 \rfloor} \]

so that in particular $a^-_\Lambda(k) = a^+_\Lambda(k) = 1/q$ for q even, and we get the following
corollary of Theorem 1.2.

Corollary 1.5 Fix $q \in \mathbb{N}$. There are positive constants $c_1, c_2, c_3$ and $c_4$
(depending on q) such that for every d-regular bipartite graph G on N vertices,
the following statements hold. If $\chi$ is a uniformly chosen proper q-colouring of G
and $\varepsilon > c_1/\sqrt{d}$ then for q even

\[ \Pr\left( \exists k \in V(H) : \left| \frac{|\chi^{-1}(k)|}{N} - \frac{1}{q} \right| > \varepsilon \right) \le c_2 \varepsilon^{-1} 2^{-c_3 \varepsilon^2 N} \]

and for q odd

\[ \Pr\left( \exists k \in V(H) : \frac{|\chi^{-1}(k)|}{N} < \frac{1}{q+1} - \varepsilon \right) \le c_2 \varepsilon^{-1} 2^{-c_3 \varepsilon^2 N} \]

and

\[ \Pr\left( \exists k \in V(H) : \frac{|\chi^{-1}(k)|}{N} > \frac{1}{q-1} + \varepsilon \right) \le c_2 \varepsilon^{-1} 2^{-c_3 \varepsilon^2 N}. \]

So for even q, almost all proper q-colourings of a regular bipartite graph are
"almost equitable". Of course, by the symmetry of $K_q$ we have $E(|\chi^{-1}(k)|) =
N/q$ for all k in this case.

The condition that G be regular can be relaxed quite a bit; we simply 
require that G has not too many low degree vertices, that the sum of the de- 
grees of high degree vertices is not too large, and that the difference between 
the sizes of the partition classes is not too great. 

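Theorem 1.6 Fix H and $\Lambda$. There are positive constants $c_1, c_2, c_3$ and $c_4$
(depending on H and $\Lambda$) such that the following statements hold. Let G be
a bipartite graph on N vertices with bipartition classes $\mathcal{E}$ and $\mathcal{O}$ (with
$|\mathcal{O}| \ge |\mathcal{E}|$). Let d be an arbitrary positive parameter. Let $\varepsilon$ satisfy
$\varepsilon > c_1 \sqrt{h(G,d)}$ where

\[ h(G,d) = \frac{1}{d} + \frac{|\{v \in \mathcal{E} : d(v) < d\}|}{N} + \frac{|\mathcal{O}| - |\mathcal{E}|}{N} + \frac{1}{dN} \sum_{v \in \mathcal{O}} (d(v) - d) 1_{\{d(v) > d\}}. \]

Then for each $k \in V(H)$ we have (2), as well as (3) with now

\[ \zeta = \max\left\{ \sqrt{h(G,d)}, \sqrt{\frac{\log N}{N}} \right\}. \]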



If G is d-regular then $h(G,d) = 1/d$ and so Theorem 1.6 is a generalization
of Theorem 1.2. The proof of Theorem 1.6 follows the same lines as already
described for Theorem 1.2, except that we now require a new upper bound on
$Z_\Lambda(G,H)$. In Section 2 we modify the entropy-based proof of [6] to obtain
the following, which is just what we need for Theorem 1.6, the proof of which
is then given in Section 3. Here $d(v) = |\{u \in V(G) : uv \in E(G)\}|$ is the
degree of vertex v, and we write $w_\Lambda(H)$ for $w_\Lambda(V(H))$.

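Theorem 1.7 Fix H and $\Lambda$, and suppose that $\lambda_i \ge 1$ for all $i \in V(H)$. Let
G be any bipartite graph on bipartition classes $\mathcal{E}$ and $\mathcal{O}$, with $|\mathcal{O}| \ge |\mathcal{E}|$, and
let d be an arbitrary positive parameter. Then

\[ Z_\Lambda(G,H) \le w_\Lambda(H)^{|\{v \in \mathcal{E} : d(v) < d\}|} \prod_{v \in \mathcal{O}} Z_\Lambda(K_{d(v),d}, H)^{1/d}. \]

Note that if G is d-regular then Theorem 1.7 reduces to (4). Note also that
the condition imposed on the $\lambda_i$ by Theorem 1.7 is not restrictive: if $\Lambda'$ is
obtained from $\Lambda$ by multiplying all $\lambda_i \in \Lambda$ by the same positive constant
then $p_\Lambda(f) = p_{\Lambda'}(f)$ for every $f \in \mathrm{Hom}(G,H)$ (each weight and the partition
function are multiplied by the same factor), and so we may assume without loss of
generality that $\min\{\lambda_i : i \in V(H)\} \ge 1$.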

Theorem 1.6 is only of interest in situations where h(G, d) can be shown
to be small (as, for example, when G is d-regular). A natural situation where
we can say something about h(G, d) is in percolation. Given a graph G and
a parameter $0 \le p \le 1$, let $G_p$ be a random subgraph of G obtained by
deleting each edge independently with probability $1 - p$ (so the probability
that $G_p$ equals a given spanning subgraph $G'$ of G is $p^{|E(G')|}(1-p)^{|E(G)|-|E(G')|}$).
A corollary of Theorem 1.6 (which we will prove in Section 4) is the following
"phase transition" phenomenon for percolation on a regular bipartite graph.
If G is an n-regular bipartite graph and p is much greater than 1/n, then the
typical appearance of a $p_\Lambda$-chosen H-colouring of $G_p$ is similar to that of a
$p_\Lambda$-chosen H-colouring of G, whereas if p is much smaller than 1/n, then as
long as there is some $k \in V(H)$ with $\lambda_k/w_\Lambda(H) \notin [a^-_\Lambda(k), a^+_\Lambda(k)]$, these two
objects have different appearances.

Corollary 1.8 Fix H and $\Lambda$. Let $f(n) = \omega(1)$. There is a function $g(n) =
o(1)$ (depending on f(n)) such that if $\{G_n\}_{n=1}^{\infty}$ is a sequence of n-regular
bipartite graphs and p satisfies $p \ge f(n)/n$, then with probability at least
$1 - g(n)$ the graph $(G_n)_p$ satisfies that for each $k \in V(H)$ we have

\[ p_\Lambda( s(k,f) \in I_k(g(n)) ) \le g(n) \]

and

\[ p_\Lambda(k) \in [ a^-_\Lambda(k) - g(n),\ a^+_\Lambda(k) + g(n) ]. \]

If on the other hand $p \le 1/(f(n)n)$ then with probability at least $1 - g(n)$ we
have that for each $k \in V(H)$,

\[ p_\Lambda\left( \left| s(k,f) - \frac{\lambda_k}{w_\Lambda(H)} \right| \le g(n) \right) \ge 1 - g(n) \]

and

\[ \left| p_\Lambda(k) - \frac{\lambda_k}{w_\Lambda(H)} \right| \le g(n). \]

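For the multicast model (Example B), for example, we have

\[ a^-_\Lambda(0) = a^+_\Lambda(0) = \frac{1}{2 \sum_{i \le \lfloor k/2 \rfloor} \lambda^i} + \frac{1}{2 \sum_{i \le \lceil k/2 \rceil} \lambda^i} > \frac{1}{\sum_{i \le k} \lambda^i} = \frac{\lambda_0}{w_\Lambda(H)} \]

and so Corollary 1.8 shows a phase transition for this model. For the uniform
q-colouring model (Example C), on the other hand, Corollary 1.8 gives no
information about what happens as p crosses 1/n.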



2 Proof of Theorem 1.7

We will initially assume that for all $i \in V(H)$, we have $\lambda_i \in \mathbb{Q}$. Under this
assumption, we can relate $Z_\Lambda(G,H)$ to a uniform model. We repeat an idea
used in [?] and first introduced in [?]. Let C be any positive integer with the
property that $C\lambda_i \in \mathbb{Z}$ for each $i \in V(H)$. Let $H^C_\Lambda$ be the graph obtained
from H by the following process: replace each vertex i with a set $S_i$ of size
$C\lambda_i$, replace each edge ij ($i \ne j$) with a complete bipartite graph between
$S_i$ and $S_j$, and replace each loop ii with a complete looped graph on $S_i$. It
is easy to check that for any N-vertex graph G we have

\[ |\mathrm{Hom}(G, H^C_\Lambda)| = C^N Z_\Lambda(G,H). \qquad (5) \]

We now bound $|\mathrm{Hom}(G, H^C_\Lambda)|$ using an entropy approach that was used in [7]
to upper bound the number of independent sets in a regular bipartite graph,
and was generalized in [6] to bound $|\mathrm{Hom}(G,H)|$ for arbitrary H and regular
bipartite G. We very briefly review the necessary entropy background here;
see for example [9] for a more detailed treatment.

For a discrete random variable X, let R(X) be the support of the mass
function of X. Define the entropy of X to be

\[ H(X) = \sum_{x \in R(X)} -P(X = x) \log P(X = x), \]

where here, and throughout the rest of this paper, logarithms have base 2.
We may think of H(X) as a measure of the randomness of X or as the amount
of information it contains. The conditional entropy of X given the discrete
random variable Y is given by

\[ H(X|Y) = \sum_{y \in R(Y)} P(Y = y) \sum_{x \in R(X)} -P(X = x|Y = y) \log P(X = x|Y = y). \]

Here are the basic facts about the entropy function that we will need. The
inequality that makes entropy useful as a tool for enumeration is

\[ H(X) \le \log |R(X)| \qquad (6) \]

with equality if and only if X is uniform. For a vector $(X_1, \ldots, X_n)$ of random
variables (itself a discrete and finite valued random variable) we have a chain
rule

\[ H(X_1, \ldots, X_n) = H(X_1) + H(X_2|X_1) + \ldots + H(X_n|X_1, \ldots, X_{n-1}). \qquad (7) \]

For random variables X, Y and Z we have

\[ H(X|Y) \le H(X) \quad \text{and} \quad H(X|Y,Z) \le H(X|Y) \qquad (8) \]

(so dropping conditioning does not decrease entropy). Finally, we have
conditional subadditivity:

\[ H(X_1, \ldots, X_n|Y) \le H(X_1|Y) + H(X_2|Y) + \ldots + H(X_n|Y). \qquad (9) \]
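
The entropy facts above are easy to test numerically on a small joint distribution. The following sketch is an illustration only: the joint distribution (the uniform distribution on the three independent sets of a single edge, written as indicator vectors) and the helper names are assumptions made for the example. Conditional entropy is computed directly from its definition, so the chain rule check is not circular.

```python
from collections import defaultdict
from math import isclose, log2

def marginal(joint, coords):
    """Marginal mass function of the coordinates listed in coords."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in coords)] += p
    return dict(out)

def entropy(pmf):
    """H(X) = sum over the support of -p log2 p."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def conditional_entropy(joint, target, given):
    """H(X|Y) from the definition, using P(X=x|Y=y) = P(x, y) / P(y)."""
    p_given = marginal(joint, given)
    p_both = marginal(joint, given + target)
    return -sum(
        p * log2(p / p_given[key[:len(given)]])
        for key, p in p_both.items() if p > 0
    )

# Uniform distribution on the independent sets of one edge: (X1, X2) are indicators.
joint = {(0, 0): 1 / 3, (0, 1): 1 / 3, (1, 0): 1 / 3}

h_joint = entropy(joint)
h_x1 = entropy(marginal(joint, (0,)))
h_x2 = entropy(marginal(joint, (1,)))
h_x2_given_x1 = conditional_entropy(joint, (1,), (0,))

print(h_joint, h_x1 + h_x2_given_x1)   # chain rule (7): the two values agree
print(h_x2_given_x1, h_x2)             # (8): conditioning does not increase entropy
```

Since the joint distribution is uniform on three outcomes, (6) holds with equality here, which is the mechanism that later turns an entropy bound into a counting bound.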

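Now let f be a uniformly chosen element of $\mathrm{Hom}(G, H^C_\Lambda)$. By (7) the
entropy of f satisfies

\[ H(f) = H(f(\mathcal{E})) + H(f(\mathcal{O})|f(\mathcal{E})). \qquad (10) \]

We upper bound $H(f(\mathcal{O})|f(\mathcal{E}))$ using (8) and (9):

\[ H(f(\mathcal{O})|f(\mathcal{E})) \le \sum_{v \in \mathcal{O}} H(f(v)|f(N(v))) \qquad (11) \]

where $N(v) = \{u \in V(G) : uv \in E(G)\}$ is the neighbourhood of v. We
upper bound $H(f(\mathcal{E}))$ using a form of Shearer's Lemma [1] derived from
Radhakrishnan's proof of same (see for example [8]). Put a total order < 
on the vertices of G (only the order induced on $\mathcal{E}$ will matter; in what follows
u ranges over $\mathcal{E}$). For each $v \in \mathcal{O}$ with $N(v) = \{n_1, \ldots, n_{d(v)}\}$ where
$n_1 < \ldots < n_{d(v)}$ we have, by (7) and (8),

\[ H(f(N(v))) = \sum_{i=1}^{d(v)} H(f(n_i)|f(n_1), \ldots, f(n_{i-1})) \ge \sum_{i=1}^{d(v)} H(f(n_i)|\{f(u) : u < n_i\}) \]

and so

\[ \sum_{v \in \mathcal{O}} H(f(N(v))) \ge \sum_{w \in \mathcal{E}} d(w) H(f(w)|\{f(u) : u < w\}) = \sum_{w \in \mathcal{E}} (d + (d(w) - d)) H(f(w)|\{f(u) : u < w\}) \qquad (12) \]

where d is any positive parameter. Since by (7) again we have

\[ \sum_{w \in \mathcal{E}} H(f(w)|\{f(u) : u < w\}) = H(f(\mathcal{E})) \]

we rearrange the terms of (12) to get

\[ H(f(\mathcal{E})) \le \frac{1}{d} \sum_{v \in \mathcal{O}} H(f(N(v))) + \sum_{w \in \mathcal{E}} \left( 1 - \frac{d(w)}{d} \right) H(f(w)|\{f(u) : u < w\}). \qquad (13) \]

We combine (10), (11) and (13) to upper bound H(f) as the sum of

\[ \frac{1}{d} \sum_{v \in \mathcal{O}} \left( H(f(N(v))) + d H(f(v)|f(N(v))) \right) \qquad (14) \]

and

\[ \sum_{w \in \mathcal{E}} \left( 1 - \frac{d(w)}{d} \right) H(f(w)|\{f(u) : u < w\}). \qquad (15) \]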

We deal first with (14). Fix $v \in \mathcal{O}$. For each $A \in V(H^C_\Lambda)^{d(v)}$ that occurs as a
value of f(N(v)), let p(A) be the probability that A occurs and let e(A) be
the number of possible ways of assigning an image to v given that f(N(v))
takes value A. Expanding out the entropy terms we have

\[ H(f(N(v))) + d H(f(v)|f(N(v))) \le \sum_A p(A) \log \frac{e(A)^d}{p(A)} \qquad (16) \]
\[ \le \log \sum_A e(A)^d \qquad (17) \]
\[ \le \log |\mathrm{Hom}(K_{d(v),d}, H^C_\Lambda)| \qquad (18) \]
\[ = \log \left( C^{d(v)+d} Z_\Lambda(K_{d(v),d}, H) \right). \qquad (19) \]

We use (6) to obtain (16) and Jensen's inequality for (17), and the equality
in (19) follows from (5). To see (18) note that we specify an element of
$\mathrm{Hom}(K_{d(v),d}, H^C_\Lambda)$ by first choosing the restriction A of the homomorphism
to the partition class of size d(v) and then for each of the remaining d vertices
choosing the value independently from the e(A) possible images. Summing over
$v \in \mathcal{O}$ we see that (14) is bounded above by

\[ \log \left( C^{|\mathcal{O}| + \frac{1}{d} \sum_{v \in \mathcal{O}} d(v)} \right) + \sum_{v \in \mathcal{O}} \frac{1}{d} \log Z_\Lambda(K_{d(v),d}, H). \qquad (20) \]

For (15), if d(w) < d, we upper bound

\[ H(f(w)|\{f(u) : u < w\}) \le \log |V(H^C_\Lambda)| = \log (C w_\Lambda(H)) \]

using (6) and (8). If d(w) > d then we need a lower bound on
$H(f(w)|\{f(u) : u < w\})$. Since f is a homomorphism, there is at least one i such
that f(w) can take values in $S_i$. Fix one such. If we add the condition that
$f(w) \in S_i$ then, because the vertices of $S_i$ are indistinguishable, f(w) becomes
uniform and its entropy is the logarithm of $|S_i|$. That is,

\[ H(f(w)|\{f(u) : u < w\}) \ge H(f(w)|\{f(u) : u < w\}, \{f(w) \in S_i\}) = \log C\lambda_i \ge \log C, \]

the last inequality following from $\lambda_i \ge 1$ for all $i \in V(H)$. It follows that
(15) is bounded above by

\[ \log \left( C^{|\mathcal{E}| - \frac{1}{d} \sum_{w \in \mathcal{E}} d(w)}\, w_\Lambda(H)^{|\{w \in \mathcal{E} : d(w) < d\}|} \right). \qquad (21) \]

Putting (20) and (21) into (10) (the two sums of degrees are equal, since both
count the edges of G), using $H(f) = \log |\mathrm{Hom}(G, H^C_\Lambda)|$ (since f is
uniform) and combining with (5), we obtain Theorem 1.7 for rational $\lambda_i$'s.
By continuity, this bound remains valid when the $\lambda_i$'s are not necessarily
rational.
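
The reduction from the weighted model to a uniform one at the start of this proof rests on the identity $|\mathrm{Hom}(G, H^C_\Lambda)| = C^N Z_\Lambda(G,H)$, where $H^C_\Lambda$ replaces each vertex i of H by $C\lambda_i$ twins. A brute-force check on a small instance is sketched below; the choice of G (a path on three vertices), the weights (1, 3/2), the multiplier C = 2 and all function names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

def blow_up(adj, sizes):
    """Replace vertex i of H by sizes[i] twin copies; edges and loops are inherited."""
    owner = [i for i, size in enumerate(sizes) for _ in range(size)]
    return [[adj[a][b] for b in owner] for a in owner]

def count_homs(n, edges, adj):
    """Number of homomorphisms from the graph on {0..n-1} to the graph with matrix adj."""
    return sum(
        all(adj[f[u]][f[v]] for u, v in edges)
        for f in product(range(len(adj)), repeat=n)
    )

def partition_function(n, edges, adj, lam):
    """Weighted count: each homomorphism f contributes the product of lam[f(v)]."""
    total = 0
    for f in product(range(len(adj)), repeat=n):
        if all(adj[f[u]][f[v]] for u, v in edges):
            total += prod(lam[c] for c in f)
    return total

H_IND = [[1, 1], [1, 0]]                 # loop at 0, edge 01
LAM = [Fraction(1), Fraction(3, 2)]      # rational weights
C = 2                                    # C * lam[i] is an integer for every i
sizes = [int(C * w) for w in LAM]        # twin class sizes [2, 3]

PATH = [(0, 1), (1, 2)]                  # G = path on N = 3 vertices
N = 3

lhs = count_homs(N, PATH, blow_up(H_IND, sizes))
rhs = C ** N * partition_function(N, PATH, H_IND, LAM)
print(lhs, rhs)
```

Each homomorphism to H lifts to exactly $\prod_v C\lambda_{f(v)}$ homomorphisms to the blown-up graph, which is why the two sides agree.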



3 Proof of Theorem 1.6 



We begin by using Theorem 1.7 to put an upper bound on $Z_\Lambda(G,H)$. We
first consider those $v \in \mathcal{O}$ with d(v) > d. For each of the at most $4^{|V(H)|}$ ordered
pairs $A \sim B$ of subsets of V(H), the contribution to $Z_\Lambda(K_{d(v),d}, H)$ from those
f with the partition class of $K_{d(v),d}$ of size d(v) mapped to A and the class
of size d mapped to B is at most

\[ w_\Lambda(A)^{d(v)} w_\Lambda(B)^d \le (w_\Lambda(A) w_\Lambda(B))^d\, w_\Lambda(H)^{d(v)-d} \]

and so

\[ Z_\Lambda(K_{d(v),d}, H) \le 4^{|V(H)|} w_\Lambda(H)^{d(v)-d} \eta_\Lambda(H)^d. \]

Similarly, for those $v \in \mathcal{O}$ with $d(v) \le d$ we have

\[ Z_\Lambda(K_{d(v),d}, H) \le Z_\Lambda(K_{d,d}, H) \le 4^{|V(H)|} \eta_\Lambda(H)^d. \]

It follows from Theorem 1.7 that $Z_\Lambda(G,H)$ is upper bounded by

\[ \eta_\Lambda(H)^{|\mathcal{O}|}\, 4^{|V(H)||\mathcal{O}|/d}\, w_\Lambda(H)^{|\{v \in \mathcal{E} : d(v) < d\}| + \frac{1}{d} \sum_{v \in \mathcal{O}} (d(v)-d) 1_{\{d(v) > d\}}} \]

and so, using $|\mathcal{O}| = N/2 + (|\mathcal{O}| - |\mathcal{E}|)/2$,

\[ Z_\Lambda(G,H) \le \eta_\Lambda(H)^{N/2}\, C^{N h(G,d)} \qquad (22) \]

where C is a positive constant depending only on H and $\Lambda$. On the other
hand, we get a lower bound (with any $A \sim B$ satisfying $w_\Lambda(A) w_\Lambda(B) = \eta_\Lambda(H)$,
and using $\lambda_i \ge 1$ for all $i \in V(H)$) by

\[ Z_\Lambda(G,H) \ge w_\Lambda(A)^{|\mathcal{E}|} w_\Lambda(B)^{|\mathcal{O}|} \]
\[ \ge (w_\Lambda(A) w_\Lambda(B))^{|\mathcal{E}|} \qquad (23) \]
\[ = \eta_\Lambda(H)^{N/2}\, \eta_\Lambda(H)^{-\frac{|\mathcal{O}| - |\mathcal{E}|}{2}}. \qquad (24) \]

In (23) we are using $|\mathcal{O}| \ge |\mathcal{E}|$.

We now use (22) and (24) to prove (2). Fix $k \in V(H)$ and an integer $N_k$
satisfying $0 \le N_k \le N$ and

\[ \frac{N_k}{N} \in [0, a^-_\Lambda(k) - \varepsilon) \cup (a^+_\Lambda(k) + \varepsilon, 1] \quad (= I_k(\varepsilon)). \]

Write $c_k(N_k)$ for the contribution to $Z_\Lambda(G,H)$ from those $f \in \mathrm{Hom}(G,H)$
with $|f^{-1}(k)| = N_k$. We aim to obtain an upper bound on $c_k(N_k)$ (via (22))
which is substantially lower than the lower bound (24), indicating that this
term does not contribute greatly to $Z_\Lambda(G,H)$.
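We begin by considering $N_k$ for which

\[ \gamma := \frac{N_k}{N} = a^+_\Lambda(k) + \varepsilon' \]

for some $\varepsilon'$ satisfying $\varepsilon < \varepsilon' \le 1 - a^+_\Lambda(k)$. For any $\delta > 0$ let $\Lambda(k,\delta)$ be
obtained from $\Lambda$ by replacing $\lambda_k$ with $(1+\delta)\lambda_k$ and leaving all other $\lambda_i$'s
unchanged. By (22) we have

\[ (1+\delta)^{N_k} c_k(N_k) \le Z_{\Lambda(k,\delta)}(G,H) \le \eta_{\Lambda(k,\delta)}(H)^{N/2}\, C^{N h(G,d)} \qquad (25) \]

where now the constant C depends on $\delta$ as well as on H and $\Lambda$.

Before proceeding, we need to understand $\eta_{\Lambda(k,\delta)}(H)$. Viewed as a function
of $\delta$, the quantity $w_{\Lambda(k,\delta)}(A) w_{\Lambda(k,\delta)}(B)$ (for $(A,B) \in \mathcal{M}_\Lambda(H)$) is of the
form $a + b\delta + c\delta^2$ where $a = \eta_\Lambda(H)$, $b = w_\Lambda(A) \lambda_k 1_{\{k \in B\}} + w_\Lambda(B) \lambda_k 1_{\{k \in A\}}$
and $c = \lambda_k^2 1_{\{k \in A \cap B\}}$. From this formulation we can easily identify the
set $S^+_\Lambda(k,H) \subseteq \mathcal{M}_\Lambda(H)$ with the property that for all $\delta > 0$, all
$(A,B) \in \mathcal{M}_\Lambda(H)$ and all $(A',B') \in S^+_\Lambda(k,H)$ we have
$w_{\Lambda(k,\delta)}(A') w_{\Lambda(k,\delta)}(B') \ge w_{\Lambda(k,\delta)}(A) w_{\Lambda(k,\delta)}(B)$:
$S^+_\Lambda(k,H)$ consists of all those $(A',B') \in \mathcal{M}_\Lambda(H)$ for
which b is maximum and (subject to this condition) c is maximum. This
latter condition simply means that if some of the pairs that maximize b have
c > 0 we only take those pairs, and if they all have c = 0 we take all pairs.

It is easily seen that there is $\delta_k > 0$ (depending on H and $\Lambda$) with
the property that for all $0 < \delta < \delta_k$ and $(A',B') \in S^+_\Lambda(k,H)$ we have
$(A',B') \in \mathcal{M}_{\Lambda(k,\delta)}(H)$. Choose one such, $(A^+, B^+)$, arbitrarily. Note that by
construction

\[ \frac{\lambda_k 1_{\{k \in A^+\}}}{2 w_\Lambda(A^+)} + \frac{\lambda_k 1_{\{k \in B^+\}}}{2 w_\Lambda(B^+)} = a^+_\Lambda(k). \]

Now combining (24) and (25) and choosing $\delta < \delta_k$ we have

\[ p_\Lambda(|f^{-1}(k)| = N_k) = \frac{c_k(N_k)}{Z_\Lambda(G,H)} \le C^{h(G,d)N} \left( \frac{w_{\Lambda(k,\delta)}(A^+)\, w_{\Lambda(k,\delta)}(B^+)}{w_\Lambda(A^+)\, w_\Lambda(B^+)\, (1+\delta)^{2(a^+_\Lambda(k) + \varepsilon')}} \right)^{N/2} \qquad (26) \]

where, by our restriction on $\delta$, C may be taken to depend only on H and $\Lambda$.
Our aim is to show that there is a positive constant c (depending on H and
$\Lambda$) such that for all $0 < \varepsilon' \le 1 - a^+_\Lambda(k)$ we can find a $0 < \delta < \delta_k$ for which

\[ \frac{w_{\Lambda(k,\delta)}(A^+)\, w_{\Lambda(k,\delta)}(B^+)}{w_\Lambda(A^+)\, w_\Lambda(B^+)\, (1+\delta)^{2(a^+_\Lambda(k) + \varepsilon')}} \le 2^{-c\varepsilon'^2}. \qquad (27) \]

Combining this with (26) we see that if $\varepsilon > c\sqrt{h(G,d)}$ for some suitably large
positive constant c (depending on $\Lambda$ and H) then for all $\varepsilon < \varepsilon' \le 1 - a^+_\Lambda(k)$
for which $a^+_\Lambda(k)N + \varepsilon' N$ is an integer we have

\[ p_\Lambda(|f^{-1}(k)| = a^+_\Lambda(k)N + \varepsilon' N) \le 2^{-c'\varepsilon'^2 N} \]

for a suitable positive $c'$, and so

\[ p_\Lambda(|f^{-1}(k)| > a^+_\Lambda(k)N + \varepsilon N) \le \sum_{\ell > \varepsilon N} 2^{-c'\ell^2/N} \]
\[ \le 2^{-c'\varepsilon^2 N} \sum_{\ell \ge 0} 2^{-2\ell c' \varepsilon} \]
\[ \le c'' \varepsilon^{-1} 2^{-c'\varepsilon^2 N} \qquad (28) \]

for suitably large $c''$ (depending on $c'$). An almost identical argument (the
details of which we leave to the reader) yields

\[ p_\Lambda(|f^{-1}(k)| < a^-_\Lambda(k)N - \varepsilon N) \le c'' \varepsilon^{-1} 2^{-c'\varepsilon^2 N} \qquad (29) \]

for $\varepsilon > c\sqrt{h(G,d)}$. Combining (28) and (29) gives (2).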

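We now turn to (27). Observe that it is enough to prove (27) for all
$0 < \varepsilon' < \varepsilon_0$, where $\varepsilon_0 < 1 - a^+_\Lambda(k)$ may be any constant (perhaps depending
on H and $\Lambda$). Indeed, for any $\varepsilon' \ge \varepsilon_0$ we know that there is a choice of
$0 < \delta < \delta_k$ for which

\[ \frac{w_{\Lambda(k,\delta)}(A^+)\, w_{\Lambda(k,\delta)}(B^+)}{w_\Lambda(A^+)\, w_\Lambda(B^+)\, (1+\delta)^{2(a^+_\Lambda(k) + \varepsilon')}} \le \frac{w_{\Lambda(k,\delta)}(A^+)\, w_{\Lambda(k,\delta)}(B^+)}{w_\Lambda(A^+)\, w_\Lambda(B^+)\, (1+\delta)^{2(a^+_\Lambda(k) + \varepsilon_0)}} \le 2^{-c\varepsilon_0^2}. \]

Setting $c' = c\varepsilon_0^2$, we have $2^{-c\varepsilon_0^2} \le 2^{-c'\varepsilon'^2}$ for $\varepsilon' \ge \varepsilon_0$ and $2^{-c\varepsilon'^2} \le 2^{-c'\varepsilon'^2}$ for
$\varepsilon' < \varepsilon_0$, so we may replace c with $c'$ to obtain the result for the full range of
$\varepsilon'$. From now on we will assume that $\varepsilon' < \varepsilon_0$, for a certain $\varepsilon_0$ that will be
specified later.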
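Setting

\[ \gamma_A = \frac{\lambda_k 1_{\{k \in A^+\}}}{2 w_\Lambda(A^+)}, \qquad \gamma_B = \frac{\lambda_k 1_{\{k \in B^+\}}}{2 w_\Lambda(B^+)} \]

(so $a^+_\Lambda(k) = \gamma_A + \gamma_B$) the left-hand side of (27) becomes

\[ \frac{w_\Lambda(A^+) + \delta\lambda_k 1_{\{k \in A^+\}}}{(1+\delta)^{2\gamma_A + \varepsilon'}\, w_\Lambda(A^+)} \times \frac{w_\Lambda(B^+) + \delta\lambda_k 1_{\{k \in B^+\}}}{(1+\delta)^{2\gamma_B + \varepsilon'}\, w_\Lambda(B^+)}. \qquad (30) \]

If either $A^+ = \{k\}$ or $k \notin A^+$ then the first term of (30) is $(1+\delta)^{-\varepsilon'}$ so that
in this case we have that for any $\delta > 0$ depending only on H and $\Lambda$, and any
$0 < \varepsilon' < 1$,

\[ \frac{w_\Lambda(A^+) + \delta\lambda_k 1_{\{k \in A^+\}}}{(1+\delta)^{2\gamma_A + \varepsilon'}\, w_\Lambda(A^+)} \le 2^{-c\varepsilon'} \le 2^{-c\varepsilon'^2} \qquad (31) \]

where c is a positive constant depending on H and $\Lambda$ (the last inequality
using $\varepsilon' < 1$). If $k \in A^+$ and $|A^+| > 1$ then the first term of (30) takes the
form

\[ \frac{w_\Lambda(A^+) + \delta\lambda_k}{(1+\delta)^{2\gamma_A + \varepsilon'}\, w_\Lambda(A^+)} \le \frac{1 + \delta(\lambda_k/w_\Lambda(A^+))}{1 + \delta(2\gamma_A + \varepsilon')} \le 1 - \frac{\delta\varepsilon'}{1 + \delta((\lambda_k/w_\Lambda(A^+)) + \varepsilon')} \qquad (32) \]

with (32) valid for sufficiently small $\varepsilon'$. Now taking $\delta = \varepsilon'$ (having chosen $\varepsilon_0$
small enough that this choice is allowed, and that (32) holds), we get a bound
of $2^{-c\varepsilon'^2}$ on the first term of (30), where c is a positive constant depending
on H and $\Lambda$ only.

Repeating this analysis for the second term of (30), we obtain (27) and
thus (2).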

Applying © with e = c^(log N)/N (if (log N)/N > and £ = 

c^/h[G, d) (otherwise), where c> c\ satisfies c 2 c% > 1, we easily obtain (j3J), 
based on the observation that in both cases 

E A (s(k, /)) < (a+(&) + e) (l - e^"^-^) + c 2 e- l 2^ N 
with a similar lower bound involving $a^-_\Lambda(k)$.

4 Proof of Corollary 1.8

We assume throughout that $|V(G_n)| = N$ (a function of $n$) and that $G_n$ has
fixed bipartition $\mathcal{E} \cup \mathcal{O}$.

We begin with the $p = \omega(1/n)$ regime. We take

$$d = np - \sqrt{2xnp}$$

with $x = \sqrt{f(n)}$. The choice of $x$ is driven by the aim of making all of
the terms of $h(G^n_p, d)$ be $o(1)$, with probability $1 - o(1)$; this is enough for
both statements of the corollary in this regime. Note that since $|\mathcal{E}| = |\mathcal{O}|$ we
immediately have $(|\mathcal{O}| - |\mathcal{E}|)/N = o(1)$.
By our choice of $x$,

$$\sqrt{2xnp} < \frac{np}{2}$$

(for large enough $n$) and so $d > np/2$ and $1/d = o(1)$.

For a given vertex $v \in \mathcal{E}$, let $d(v)$ be its degree in $G^n_p$. This is a binomial
random variable with parameters $n$ and $p$, and so by standard Chernoff-type
bounds (see for example [U Appendix A]) we have

$$P(d(v) < d) < e^{-x}.$$

(The specific bound we are using here is

$$P(\mathrm{Bin}(n,p) - np < -a) < e^{-a^2/2pn}$$

for $a > 0$.) The number of vertices from $\mathcal{E}$ which have degree smaller than
$d$ is therefore binomial with parameters $N/2$ and $p' < e^{-x}$. The expected
number of such vertices is at most $Ne^{-x}/2$, and by Markov's inequality the
probability that there are more than $Ne^{-x+\sqrt{x}}/2$ such vertices is at most $e^{-\sqrt{x}}$. Since
$x = \omega(1)$, this is $o(1)$, and so with probability $1 - o(1)$ we have

$$\frac{|\{v \in \mathcal{E} : d(v) < d\}|}{N} = o(1).$$
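The Chernoff-type bound quoted above can be checked numerically for small parameters. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the values of $n$, $p$ and $x$ are arbitrary illustrative choices. It computes the exact lower tail of a binomial and compares it with $e^{-a^2/2pn}$ at $a = \sqrt{2xnp}$, where the bound equals $e^{-x}$.

```python
from math import ceil, comb, exp, sqrt


def binomial_lower_tail(n: int, p: float, t: float) -> float:
    """Exact P(Bin(n, p) < t), summing the pmf over integers k < t."""
    largest = min(ceil(t) - 1, n)  # largest integer strictly below t
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(largest + 1))


def chernoff_lower_bound(n: int, p: float, a: float) -> float:
    """The bound e^{-a^2 / (2pn)} on P(Bin(n, p) - np < -a), for a > 0."""
    return exp(-a * a / (2 * p * n))


def degree_threshold(n: int, p: float, x: float) -> float:
    """The threshold d = np - sqrt(2xnp) used in the text."""
    return n * p - sqrt(2 * x * n * p)


if __name__ == "__main__":
    n, p, x = 200, 0.3, 2.0  # illustrative values only
    d = degree_threshold(n, p, x)
    print("exact tail:", binomial_lower_tail(n, p, d))
    print("bound e^{-x}:", exp(-x))
```

With $a = \sqrt{2xnp}$ the exponent $a^2/2pn$ simplifies to $x$, which is why the text records the bound as $e^{-x}$.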

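It remains to consider $S := \sum\{d(v) - d : v \in \mathcal{O},\ d(v) > d\}$. We have

$$E(S) = \sum_{v \in \mathcal{O}} \left( E\left(d(v)\mathbf{1}_{\{d(v)>d\}}\right) - dE\left(\mathbf{1}_{\{d(v)>d\}}\right) \right)$$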



and so 



veo \j>d \3 ' 

$$\le N\left(np - d + de^{-x}\right)$$

$$\le N\left(\sqrt{2xnp} + npe^{-x}\right),$$



*(||^&3(=H)) 





for large enough $n$ (again using $d > np/2$). Again by Markov's inequality,
with probability $1 - o(1)$ we have $S/dN = o(1)$, and so with probability $1 - o(1)$
we have $h(G^n_p, d) = o(1)$, as required.

We now deal with the $p = o(1/n)$ regime. The probability that a particular
vertex is isolated in $G^n_p$ is $(1-p)^n > 1 - 2f(n)$ (for large enough $n$), so the
number of non-isolated vertices in $\mathcal{E}$ is a binomial random variable with
parameters $N/2$ and $p' < 2f(n)$. By the Chernoff bound, asymptotically almost
surely (with probability tending to one as $n$ tends to infinity) $\mathcal{E}$ has fewer than
$2f(n)N$ non-isolated vertices and so also asymptotically almost surely $G^n_p$ has
fewer than $4f(n)N$ non-isolated vertices. For each $k \in V(H)$, the number of
isolated vertices mapped to $k$ is a binomial random variable with parameters
$m > N(1 - 4f(n))$ and $p'' = \lambda_k/w_\Lambda(H)$ and so (again by Chernoff bounds)
asymptotically almost surely there are at least $N(1 - 5f(n))\lambda_k/w_\Lambda(H)$
vertices of $G^n_p$ mapped to $k$. Since $\sum_{k \in V(H)} \lambda_k/w_\Lambda(H) = 1$, we also have that
asymptotically almost surely there are at most



$$N\left(\frac{\lambda_k}{w_\Lambda(H)} + 5f(n)\left(1 - \frac{\lambda_k}{w_\Lambda(H)}\right)\right)$$

vertices of $G^n_p$ mapped to $k$. This completes the proof of the corollary.

5 Proof of Theorem 1.4

The graph $G_d$ will be a random $d$-regular bipartite graph on $N = c^{d/\log d}$
vertices (where $c > 1$ will depend on the particular $H$ and $\Lambda$ under
consideration). A standard method of constructing such a graph is as follows. We
begin with a set of size $Nd$ consisting of $Nd/2$ type I vertices $\{u_{i,j} : 1 \le i \le
N/2,\ 1 \le j \le d\}$ and $Nd/2$ type II vertices $\{v_{i,j} : 1 \le i \le N/2,\ 1 \le j \le d\}$.
We then choose a uniformly random perfect matching from the type I vertices
to the type II vertices, and turn this into a $d$-regular bipartite multigraph on
$N$ vertices with bipartition classes $\mathcal{E} = \{u_1, \ldots, u_{N/2}\}$, $\mathcal{O} = \{v_1, \ldots, v_{N/2}\}$ by,
for each $i = 1, \ldots, N/2$, identifying $u_{i,1}, \ldots, u_{i,d}$ with $u_i$ and $v_{i,1}, \ldots, v_{i,d}$ with
$v_i$. Finally, we condition on the result being a simple graph. This process
generates a $d$-regular bipartite graph on $N$ vertices with bipartition classes
$\mathcal{E}$, $\mathcal{O}$, uniformly (see for example [13]).
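The construction just described can be sketched in code for small parameters. This is an illustrative sketch only: the function names are hypothetical, and conditioning on simplicity is implemented here by rejection sampling, which is one straightforward way to realise the conditioning but is practical only for small $d$. Shuffling the list of type II points and pairing it in order with the type I points gives a uniformly random perfect matching.

```python
import random
from collections import Counter


def random_bipartite_multigraph(half, d, rng):
    """Match half*d type I points to half*d type II points uniformly at random.

    Point (i, j) is identified with vertex i on its side, so the result is a
    list of half*d edges (u, v) of a d-regular bipartite multigraph.
    """
    # Owner vertex of each type II point; a uniform shuffle is a uniform matching.
    type_two_owner = [i for i in range(half) for _ in range(d)]
    rng.shuffle(type_two_owner)
    # Type I point number t belongs to vertex t // d.
    return [(t // d, type_two_owner[t]) for t in range(half * d)]


def is_simple(edges):
    """True when no edge is repeated (loops cannot occur in a bipartite pairing)."""
    return all(count == 1 for count in Counter(edges).values())


def random_regular_bipartite_graph(half, d, seed=0, max_tries=10_000):
    """Rejection-sample until the multigraph is simple (assumed approach)."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        edges = random_bipartite_multigraph(half, d, rng)
        if is_simple(edges):
            return edges
    raise RuntimeError("no simple graph found within max_tries")


if __name__ == "__main__":
    print(random_regular_bipartite_graph(half=6, d=3, seed=1))
```

Here `half` plays the role of $N/2$. Every simple $d$-regular bipartite graph arises from the same number of matchings, which is why accepting only simple outcomes yields the uniform distribution.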

O'Neil [UJ showed that the probability that the multigraph produced by 
this process is simple is (for large enough d) at least e~ d2 ^ 3 . It follows that 
if we establish that the multigraph produced (before conditioning on being
simple) has a certain property with probability at least $1 - e^{-d^2}$ (say), then
there is a simple $d$-regular graph with that property.

We want to establish that for large enough $d$ the multigraph has a number
of desirable expansion properties. First, we want to show that for each
$C\log d \le j \le 3N\log d/d$ (for some constant $C > 0$, depending on $c$), every
subset of $\mathcal{E}$ of size $j$ and every subset of $\mathcal{O}$ of size $j$ has at least $aj$ distinct
neighbours where $a = d/(C\log d)$. For a particular such $j$, the probability
that the graph fails to have this property is (by a union bound) at most

2a i /o^,„ \ i d 



<?)( 



N/2\ (ajd) jd < feN_Y aJ [2al\ 
aj J (Nd/2) jd ~ \2aj) \N J 



zjd ( 2jd 

g C log d 



CN log d 



2jd ( 2nd 
< e cio g d ' 



CN log d 



jd/2 



(for large enough $d$, depending on $C$) with the first inequality using $\binom{n}{r} \le
(en/r)^r$. For $j \ge d\log d$ we bound $2jd/(CN\log d) \le 1/2$ (valid for $C \ge 12$)
so that for large enough $d$ (depending on $C$)



2jd ) < \.r'"< 

CNhgd 



jd/2 



e 



-2d 



For $j \le d\log d$ we instead bound $(2dj)/(CN\log d) \le d^2/N$ (valid for $C \ge 2$).
We now have 

ajd / 2jd \ jd / 2 ( n .„ , j^ 2 logcl f— jd 2 \ogc 

eClogd ~Ft att j — ex P i 2jdlogd — — > < exp 



CA^logrfy ~ ° 2\ogd J " ^[ 31ogd 

(again for large $d$, recalling $N = c^{d/\log d}$), which is at most $e^{-2d^2}$ for $j \ge C\log d$
for suitable $C$ depending on $c$. Since there are at most $N = c^{d/\log d}$ choices for
$j$, the probability that the graph fails to have the desired property for some
$j$ is at most $e^{-d^2}$. If the process results in a simple graph, then we trivially
get the same expansion for subsets of $\mathcal{E}$ or $\mathcal{O}$ of size at most $C\log d$, since
for $1 \le j \le C\log d$ there is a trivial lower bound of $d$ on the neighbourhood
size of a set of size $j$, and we have $d \ge jd/(C\log d)$ for $j$ in this range.

Next we establish that the graph has the property that for every subset
$A$ of $\mathcal{E}$ of size $3N\log d/d$ and every subset $B$ of $\mathcal{O}$ of size $3N\log d/d$, there
is an edge joining a vertex of $A$ to a vertex of $B$. By a union bound, the
probability that the multigraph fails to have the property is at most

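$$\binom{N/2}{\beta N}^2 \frac{\left((N/2 - \beta N)d\right)_{\beta N d}}{(Nd/2)_{\beta N d}} \le \exp\left\{2\beta N \log(e/(2\beta)) - 2\beta^2 dN\right\}$$

where $\beta = 3\log d/d$. With $N = c^{d/\log d}$, this is at most $e^{-d^2}$ for large enough
$d$ (depending on $c$). We have shown the following.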
Lemma 5.1 Fix $c > 1$. There are $d_0 > 1$ and positive $C$, both depending
on $c$, such that for all $d > d_0$ there is a $d$-regular, bipartite graph $G_d$ on
$N = c^{d/\log d}$ vertices with bipartition classes $\mathcal{E}$ and $\mathcal{O}$ satisfying the following:

1. Every subset of $\mathcal{E}$ or $\mathcal{O}$ of size $j$, with $1 \le j \le 3N\log d/d$, has at least
$jd/(C\log d)$ neighbours.

2. Every pair of subsets each of size $3N\log d/d$, one from $\mathcal{E}$ and one from
$\mathcal{O}$, have an edge between them.

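We now fix such a $G_d$ and study $Z_\Lambda(G_d, H)$. Given $f \in \mathrm{Hom}(G_d, H)$ set

$$\mathcal{E}(f) = \{k \in V(H) : |f^{-1}(k) \cap \mathcal{E}| \ge 3N\log d/d\}$$

and

$$\mathcal{O}(f) = \{k \in V(H) : |f^{-1}(k) \cap \mathcal{O}| \ge 3N\log d/d\}.$$

Clearly both $\mathcal{E}(f)$ and $\mathcal{O}(f)$ are non-empty, and by Lemma 5.1 we have
$\mathcal{E}(f) \sim \mathcal{O}(f)$ (that is, everything in $\mathcal{E}(f)$ is adjacent to everything in $\mathcal{O}(f)$).
So we can partition $\mathrm{Hom}(G_d, H)$ into classes indexed by pairs $(A, B)$ with
$A \sim B$. Write $C(A, B)$ for the class corresponding to $(A, B)$. We want to
establish that for $(A, B) \in \mathcal{M}_\Lambda(H)$ we have

$$\sum_{f \in C(A,B)} w_\Lambda(f) = (1 + o(1))\,\eta_\Lambda(H)^{N/2} \qquad (33)$$

while for all other $(A, B)$ we have

$$\sum_{f \in C(A,B)} w_\Lambda(f) = o\left(\eta_\Lambda(H)^{N/2}\right), \qquad (34)$$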

where all asymptotic terms are (unless stated otherwise) as $d \to \infty$. From
this we see that

$$Z_\Lambda(G_d, H) = |\mathcal{M}_\Lambda(H)|\,(1 + o(1))\,\eta_\Lambda(H)^{N/2},$$

and that all but a vanishing proportion of $Z_\Lambda(G_d, H)$ comes from pure-$(A, B)$
colourings (with $(A, B) \in \mathcal{M}_\Lambda(H)$) in which $\mathcal{E}$ is mapped to $A$ and $\mathcal{O}$ to $B$,
with each such $(A, B)$ contributing equally to $Z_\Lambda(G_d, H)$; this is enough to
give the first part of Theorem 1.4. Indeed, fix $(A, B) \in \mathcal{M}_\Lambda(H)$. A proportion
$(1 + o(1))/|\mathcal{M}_\Lambda(H)|$ of $Z_\Lambda(G_d, H)$ is obtained by independently colouring
$\mathcal{E}$ from $A$ and $\mathcal{O}$ from $B$ according to the given weights. Fix $k \in A$. We
claim that with very high probability, a proportion very close to $\lambda_k/w_\Lambda(A)$
of $\mathcal{E}$ gets mapped to $k$. Set $p = \lambda_k/w_\Lambda(A)$ and $m = N/2$. The number $\sigma_k$ of
vertices of $\mathcal{E}$ mapped to $k$ is a binomial random variable with parameters $m$
and $p$. So by Tchebychev's inequality,

$$\Pr\left(|\sigma_k - pm| > \log m \sqrt{mp(1-p)}\right) \le \frac{1}{\log^2 m}.$$

This shows that the proportion of vertices mapped to $k$ in a pure-$(A, B)$
colouring is very close to

$$\frac{\lambda_k \mathbf{1}_{\{k \in A\}}}{2w_\Lambda(A)} + \frac{\lambda_k \mathbf{1}_{\{k \in B\}}}{2w_\Lambda(B)}$$

with high probability. Applying this with $(A, B) = (A^+, B^+)$ and $(A, B) =
(A^-, B^-)$, the first part of Theorem 1.4 follows.
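The concentration claim for pure-$(A, B)$ colourings is easy to observe numerically. The sketch below is illustrative only, with hypothetical function names: it evaluates the limiting proportion $\lambda_k \mathbf{1}_{\{k \in A\}}/2w_\Lambda(A) + \lambda_k \mathbf{1}_{\{k \in B\}}/2w_\Lambda(B)$ and compares it with one sampled colouring in which each vertex of one class is coloured independently from $A$ and each vertex of the other class independently from $B$, with probabilities proportional to the weights. The example takes $H = K_3$ with unit weights, $A = \{0\}$ and $B = \{1, 2\}$, where the formula gives $1/2$ for colour 0 and $1/4$ for colours 1 and 2; these are the values $1/(q-1)$ and $1/(q+1)$ from the abstract with $q = 3$.

```python
import random


def limiting_proportion(weights, A, B, k):
    """lambda_k 1{k in A} / (2 w(A)) + lambda_k 1{k in B} / (2 w(B))."""
    w_A = sum(weights[c] for c in A)
    w_B = sum(weights[c] for c in B)
    part_A = weights[k] / (2 * w_A) if k in A else 0.0
    part_B = weights[k] / (2 * w_B) if k in B else 0.0
    return part_A + part_B


def sample_pure_colouring_proportion(weights, A, B, k, half, rng):
    """Colour `half` vertices from A and `half` from B; return the share mapped to k."""
    A, B = list(A), list(B)
    left = rng.choices(A, weights=[weights[c] for c in A], k=half)
    right = rng.choices(B, weights=[weights[c] for c in B], k=half)
    return (left.count(k) + right.count(k)) / (2 * half)


if __name__ == "__main__":
    unit = {0: 1.0, 1: 1.0, 2: 1.0}  # H = K_3 with unit weights (illustrative)
    rng = random.Random(0)
    for colour in (0, 1, 2):
        exact = limiting_proportion(unit, {0}, {1, 2}, colour)
        seen = sample_pure_colouring_proportion(unit, {0}, {1, 2}, colour, 20_000, rng)
        print(colour, exact, seen)
```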

The lower bound in (33) is obtained by considering pure-$(A, B)$ colourings
with $\mathcal{E}$ mapped to $A$ and $\mathcal{O}$ to $B$. To establish (34) and the upper bound
in (33), fix $0 \le j \le 3N\log d/d$, let $q = |V(H)|$, and assume that $d$ is large.
We consider the contribution to $\sum_{f \in C(A,B)} w_\Lambda(f)$ from those $f \in C(A, B)$ in
which, for each $k \notin A \cup B$, we have at most $j$ vertices mapped to $k$, and we
have at least one $k' \notin A \cup B$ whose preimage has size $j$. To bound the
contribution from these $f$, we first bound the number of ways of locating the vertices
that are mapped to $k$ for each $k \notin A \cup B$ by $\left(\sum_{i \le j} \binom{N}{i}\right)^q$. The contribution
to the sum of the weights from these exceptional vertices is at most $w_\Lambda(H)^{qj}$.
For the contribution from the remaining vertices, we deal separately with
the cases $(A, B) \in \mathcal{M}_\Lambda(H)$ and $(A, B) \notin \mathcal{M}_\Lambda(H)$. For $(A, B) \notin \mathcal{M}_\Lambda(H)$, we
simply upper bound the contribution by $(w_\Lambda(A)w_\Lambda(B))^{N/2}$, leading to

Mf) < (w A (A) WA (B)) f (e(T)1 (^wr 

feC(A,B) V i<j ^ ' ) 

= o( VA ( H r>), 

as required. For $(A, B) \in \mathcal{M}_\Lambda(H)$, consider a $k'$ that has preimage size $j$. We
claim that there are at least $jd/(2C\log d)$ vertices which, in the specification
of $f$, need to be mapped to $A \cup B$ and which are adjacent to at least one
of the $j$ vertices mapped to $k'$. Indeed, by Lemma 5.1 the neighbourhood
size of the $j$ vertices mapped to $k'$ is at least $jd/(C\log d)$, and at most $qj$
vertices have been mapped to vertices from outside $A \cup B$, so there are at
least $jd/(C\log d) - qj \ge jd/(2C\log d)$ vertices that are adjacent to a vertex
mapped to $k'$ and need to be mapped to vertices from $A \cup B$. Since $k'$
cannot be adjacent to everything in $A$, nor can it be adjacent to everything
in $B$ (else we would not have $(A, B) \in \mathcal{M}_\Lambda(H)$), our choice on these at
least $jd/(2C\log d)$ vertices is restricted to a proper subset of $A \cup B$; the
contribution we get from the remaining vertices (those mapped to $A \cup B$) is
therefore at most

$$\frac{(w_\Lambda(A)w_\Lambda(B))^{N/2}}{(1+\varepsilon)^{\frac{jd}{2C\log d}}}$$

where $\varepsilon > 0$ (depending on $H$ and $\Lambda$) can be chosen uniformly for all $A$, $B$.
Combining these observations we get that



f&C{A,B) 



\ _|_ e \ 2Clog,/ 



If $j = 0$, the right-hand side above is $(w_\Lambda(A)w_\Lambda(B))^{N/2}$. For $j > 0$ it can be
bounded above by

1 

for some $\varepsilon' > 0$ (depending on $H$ and $\Lambda$) for all $j$ in the range $1 \le j \le
3N\log d/d$, as long as $c$ is sufficiently small (recall $N = c^{d/\log d}$). Summing
over $j$ gives the upper bound in (33).

We now turn to the second part of Theorem 1.4. We take $G'_d$ to be the
disjoint union of $m$ copies of $K_{d,d}$ where $m = m(d) = \omega(1)$. Fix $k \in V(H)$.
Let $X$ be the number of vertices mapped to $k$ in a $p_\Lambda$-chosen $H$-colouring of
$G'_d$, and $X_i$ the number mapped to $k$ in the $i$th copy of $K_{d,d}$. Define $a_\Lambda(k)$
by $E(X_i) = 2d\,a_\Lambda(k)$, and note that $\mathrm{Var}(X_i) \le d^2$. Since $X = \sum_{i=1}^m X_i$ we
have $E(X) = 2dm\,a_\Lambda(k)$ and $\mathrm{Var}(X) \le md^2$. By Tchebychev's inequality,

$$P(|X - 2dm\,a_\Lambda(k)| > 2dm\varepsilon) = P(|X/2dm - a_\Lambda(k)| > \varepsilon) \le 1/(4m\varepsilon^2).$$

So choosing $\varepsilon = o(1)$ with $m\varepsilon^2 = \omega(1)$ (for example, $\varepsilon = 1/m^{1/3}$), the
probability that the proportion of vertices mapped to $k$ in a $p_\Lambda$-chosen
$H$-colouring of $G'_d$ differs from $a_\Lambda(k)$ by more than $o(1)$ is at most $o(1)$. The
claimed bound on $s(k, f)$ follows, as does the estimate of $p_\Lambda(k)$.
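For very small $d$ the quantity $a_\Lambda(k)$ defined by $E(X_i) = 2d\,a_\Lambda(k)$ can be computed exactly by enumerating all $H$-colourings of $K_{d,d}$. The sketch below is illustrative and not part of the proof. It assumes that a $p_\Lambda$-chosen colouring $f$ is selected with probability proportional to the product of the weights $\lambda_{f(v)}$ over all vertices $v$; the function name and the dictionary-based encoding of $H$ are hypothetical.

```python
from fractions import Fraction
from itertools import product


def kdd_expected_proportions(adjacency, weights, d):
    """Exact E(X_i) / (2d) for each k in V(H), by brute force over K_{d,d}.

    adjacency maps each vertex of H to the set of its neighbours; weights maps
    each vertex to its weight. Returns (proportions, total weight of colourings).
    """
    vertices = sorted(adjacency)
    total = Fraction(0)
    mass = {k: Fraction(0) for k in vertices}
    for left in product(vertices, repeat=d):
        for right in product(vertices, repeat=d):
            # f is an H-colouring of K_{d,d} exactly when every colour used on
            # one side is adjacent in H to every colour used on the other side.
            if all(b in adjacency[a] for a in set(left) for b in set(right)):
                w = Fraction(1)
                for colour in left + right:
                    w *= weights[colour]
                total += w
                for colour in left + right:
                    mass[colour] += w  # one unit of weight per vertex mapped here
    return {k: mass[k] / (2 * d * total) for k in vertices}, total


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}  # H = K_3 (illustrative)
    unit = {0: 1, 1: 1, 2: 1}
    print(kdd_expected_proportions(triangle, unit, d=2))
```

The enumeration visits $|V(H)|^{2d}$ assignments, so it is only a sanity check for small cases; for $H = K_3$ with unit weights, symmetry forces every colour to receive proportion $1/3$.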